Simple ISA

Home Up

Home Teaching Glossary ARM Processors Supplements Prof issues About

A Simple Instruction Set Architecture

A considerable fraction of ISAs used to illustrate computer architecture texts are based on the MIPS processors. This is a 32-bit processor (64 bits in its later incarnations) with a register-to-register RISC architecture using 32 registers. The processor is elegant and makes a popular vehicle for introducing the ISA and the pipelined architecture.

Here, we are going to create out own 32-bit processor and attempt to make is even simpler ISA than MIPS. One simplification will be the structure of the proocessor. We will reduce the apparent number of signal paths to make it easer to understand the processor’s operation. Let’s begin with the ISA. Having decided to adopts a 32-bit wordlength we have to balance instruction count, the number of registers, and the length of literals.

It would be nice to have a 32-bit literal, 128 registers, and 512 instructions. This would take an instruction 32 bits + 3 x 7 bits + 9 bits = 62 bits. That’s bigger than 32 bits, so we don’t get what we want. What can we actually get?

Registers are the key. Too few registers and you are forever fetching data from slow memory. Too many and you need an extravagant number of register select bits. Consider the following:

Registers Address bits t Total bits Remaining bits

4 2 6 26

8 3 9 23

16 4 12 20

32 5 15 17

64 6 18 14

128 7 21 11

The three lines in read are the only viable options for a 32-bit op-code. Eight or fewer registers would require excessive memory traffic. More than 32 registers would either require a very short literal or a tiny number of op-codes. MIPS, SPARC and the PowerPC have 32 registers, and ARM has 16 registers. Like ARM, we will choose 16 register, which leaves us with 20 bits for an op-code and a literal.

We will use a 12-bit literal. That gives us an unsigned range of 0 to 4,095. When we do not require a second source register, we can combines those bits with the literal to give a 4 + 12 = 16-bit literal. The following figure illustrates the structure of an op-code.

We are now going to create a microarchitecture. We will use a flow-through or single-cycle structure; that is, a non-pipelined implementation. This matches typical MIPS-style structures with five stages: program counter, instruction memory, register file (read), ALU, data memory, register file (write). The register file appears twice because the read operation takes part at the beginning of a cycle and the write operation takes part at the end of a cycle.

The following figure gives part of the structure of a processor. We have omitted a few features for simplicity. In order to reduce the systems apparent complexity, some circuits that normally appear explicitly have been incorporated in functional units. There have been marked with colored circles. If you wish to see the sequence of events taking place when a data-processing instruction like ADD r0,r1,r2 is executed, click on the text in the blue box to the right.

The figure below demonstrates how the conventional depiction of he PC path had been simplified. The PC receives two inputs via a multiplexor. One is the sequential, next instruction value that is the old PC incremented by 4 bytes to point to the next instruction. The other is the branch value that consists of the incremented PC plus the offset. However, the offset is not stored exactly because you can’t store a 32-bit literal in a 32-bit op-code. First, the literal must be signed extended to 32 bits; that is, the sign-bit is repeated to create a 32-bit value. Second, because all branch addresses are aligned to a 32-bit word address, the two least significant bits of the target are always zero. There is no point in storing these two zeros in the literal field Consequently, the target address is stored as a word value and then left shifter twice to create a byte value before adding to the PC.

The logic block below provides the circuitry which is normally included in the diagrams of the organization of typical RISC ISAs. I have created a PC block (see below) that includes all this logic in order to simplify the processor’s structure and to enable the reader to better concentrate on the data flow.

The next figure adds the ability to execute register-indirect jumps. A register-indirect jump loads the program counter with the contents of a register (or even the contents of a register plus an offset). This figure provides a new path to the PC block from the output of the ALU (i.e., the result) to the PC. We now hve to include a three-way multiplexor in the PC to include next sequential address, branch address, and jump to register address.

We can now look at the control signals required to implements machine-level operations. These are labeled on the following diagram.

Click here to follow the execution of a single instruction.